Development of Reliable Aqueous Solubility Models and Their Application in Druglike Analysis

Authors

  • Junmei Wang
  • George Krudy
  • Tingjun Hou
  • Wei Zhang
  • George Holland
  • Xiaojie Xu
Abstract

In this work, two reliable aqueous solubility models, ASMS (aqueous solubility based on molecular surface) and ASMS-LOGP (aqueous solubility based on molecular surface using ClogP as a descriptor), were constructed from atom-type-classified solvent-accessible surface areas and several molecular descriptors for a diverse data set of 1708 molecules. For ASMS (which does not use ClogP as a descriptor), the leave-one-out q² and root-mean-square error (RMSE) were 0.872 and 0.748 log units, respectively. ASMS-LOGP was slightly better than ASMS (q² = 0.886, RMSE = 0.705). Both models were extensively validated by three cross-validation tests, and encouraging predictability was achieved. High-throughput aqueous solubility prediction was conducted for a number of data sets extracted from several widely used databases. We found that real drugs are about 20-fold more soluble than the so-called druglike molecules in the ZINC database, which have no violations of Lipinski's "Rule of 5". Specifically, oral drugs are about 16-fold more soluble, while injection drugs are 50- to 60-fold more soluble. If the criterion for a molecule to be considered soluble is set to -5 log units, about 85% of real drugs are predicted to be soluble; in contrast, only 50% of druglike molecules in ZINC are. We concluded that the two models could serve as a rule in druglike analysis and as an efficient filter for prioritizing compound libraries prior to high-throughput screening (HTS).
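The figures quoted in the abstract (fold differences, a -5 log unit cutoff, q² and RMSE) can be related to one another with a few lines of arithmetic. The following is a minimal Python sketch, not the authors' code: the function names are hypothetical, the "soluble" test is assumed to be log S greater than or equal to the cutoff, and q² is assumed to follow the common definition 1 - PRESS/SS_total, which the abstract itself does not spell out.

```python
import math


def fold_from_log_diff(delta_log_s):
    """Convert a difference in log10 solubility into a fold ratio."""
    return 10.0 ** delta_log_s


def log_diff_from_fold(fold):
    """Convert a fold ratio in solubility into a log10 difference."""
    return math.log10(fold)


def fraction_soluble(log_s_values, threshold=-5.0):
    """Fraction of molecules with predicted log S at or above the cutoff.

    Assumption: 'soluble' means log S >= threshold; the abstract only
    states that the criterion is set to -5 log units.
    """
    values = list(log_s_values)
    if not values:
        raise ValueError("log_s_values must not be empty")
    return sum(1 for v in values if v >= threshold) / len(values)


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def q2(observed, loo_predicted):
    """Cross-validated q2, assumed here to be 1 - PRESS / SS_total.

    loo_predicted[i] must be the prediction for sample i from a model
    fitted without sample i (leave-one-out).
    """
    if len(observed) != len(loo_predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, loo_predicted))
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    if ss_total == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - press / ss_total


if __name__ == "__main__":
    # Fold ratios reported in the abstract, expressed as log-unit gaps.
    for fold in (16, 20, 50, 60):
        print(fold, "fold =", round(log_diff_from_fold(fold), 2), "log units")

    # Made-up log S values, for illustration only (not data from the paper).
    toy_log_s = [-6.2, -5.0, -4.1, -3.3]
    print("fraction soluble:", fraction_soluble(toy_log_s))
```

Read this way, the reported 20-fold gap between real drugs and ZINC druglike molecules corresponds to a shift of about 1.3 log units in predicted solubility, which is larger than either model's RMSE.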

Similar resources

Prediction of aqueous solubility of druglike organic compounds using partial least squares, backpropagation network and support vector machine

Aqueous solubility of drug compounds plays a very important role in drug research and development. In this study, we have collected 225 diverse druglike molecules with accurate aqueous solubility. Three commonly used methods, namely partial least squares (PLS), back-propagation network (BPN) and support vector regression (SVR), were employed to model quantitative structure–property relationship...

Uniting Cheminformatics and Chemical Theory To Predict the Intrinsic Aqueous Solubility of Crystalline Druglike Molecules

We present four models of solution free-energy prediction for druglike molecules utilizing cheminformatics descriptors and theoretically calculated thermodynamic values. We make predictions of solution free energy using physics-based theory alone and using machine learning/quantitative structure-property relationship (QSPR) models. We also develop machine learning models where the theoretical e...

Prediction of boiling point and water solubility of crude oil hydrocarbons using sub-structural molecular fragments method

The quantitative structure–property relationship (QSPR) method is used to develop the correlation between structures of crude oil hydrocarbons (80 compounds) and their boiling point and water solubility. Sub-structural molecular fragments (SMF) calculated from structure alone were used to represent molecular structures. A subset of the calculated fragments selected using stepwise regression (fo...

Prediction of pH-Dependent Aqueous Solubility of Druglike Molecules

In the present work, the Henderson-Hasselbalch (HH) equation has been employed for the development of a tool for the prediction of pH-dependent aqueous solubility of drugs and drug candidates. A new prediction method for the intrinsic solubility was developed, based on artificial neural networks that have been trained on a druglike PHYSPROP subset of 4548 compounds. For the prediction of acid/b...

In silico prediction of aqueous solubility: a multimodel protocol based on chemical similarity.

Aqueous solubility is one of the most important ADMET properties to assess and to optimize during the drug discovery process. At present, accurate prediction of solubility remains very challenging and there is an important need of independent benchmarking of the existing in silico models such as to suggest solutions for their improvement. In this study, we developed a new protocol for improved ...

Journal:
  • Journal of Chemical Information and Modeling

Volume 47, Issue 4

Pages: -

Publication date: 2007